In this notebook you will find:¶

  • Simple statistics for the Global Terrorism Database, such as the central tendency, dispersion and shape of the dataset’s distribution.
  • Charts and analyses built on the terrorist attack data, each followed by a short commentary.
  • Comparisons across the data, including the frequency of terrorist attacks by year.
  • The attack types used most often in terrorist attacks, and how many deaths each type has caused.
  • Finally, how many people were killed in terrorist attacks in each country.

Introduction¶

According to the FBI, international terrorism is defined as "violent, criminal acts committed by individuals and/or groups who are inspired by, or associated with, designated foreign terrorist organizations or nations (state-sponsored)".

The purpose of this analysis is to discover what trends the data contain and what they can tell us about global terrorist attacks: where they occur, which types of attack and weapons are used, who the targets are, and which terrorist groups are the largest.

The variables of interest in this analysis are:

  • Year: Year the attack took place (1970-2017 is the range)
  • Country: Country the terrorist attack took place in
  • Region: Region the terrorist attack took place in
  • City: City the terrorist attack took place in
  • Attack Type: How the terrorist attacked the victim
  • Weapon Type: Weapon used by terrorist to attack the victim
  • Target: Who the target of this terrorist attack is
  • Affiliation: What terrorist group is the terrorist part of

Data Processing¶

In [11]:
# pip install folium
In [87]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns # advanced visualization tool
import folium # visualize lat long in map
from folium.plugins import MarkerCluster 

#  Input data files are available in the directory.

from subprocess import check_output
In [88]:
df=pd.read_csv('globalterrorism (1).csv', encoding="ISO-8859-1", low_memory=False) # low_memory=False avoids the mixed-type DtypeWarning
In [122]:
df
Out[122]:
eventid iyear imonth iday approxdate extended resolution country country_txt region ... addnotes scite1 scite2 scite3 dbsource INT_LOG INT_IDEO INT_MISC INT_ANY related
0 197000000001 1970 7 2 NaN 0 NaN 58 Dominican Republic 2 ... NaN NaN NaN NaN PGIS 0 0 0 0 NaN
1 197000000002 1970 0 0 NaN 0 NaN 130 Mexico 1 ... NaN NaN NaN NaN PGIS 0 1 1 1 NaN
2 197001000001 1970 1 0 NaN 0 NaN 160 Philippines 5 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
3 197001000002 1970 1 0 NaN 0 NaN 78 Greece 8 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
4 197001000003 1970 1 0 NaN 0 NaN 101 Japan 4 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
181686 201712310022 2017 12 31 NaN 0 NaN 182 Somalia 11 ... NaN "Somalia: Al-Shabaab Militants Attack Army Che... "Highlights: Somalia Daily Media Highlights 2 ... "Highlights: Somalia Daily Media Highlights 1 ... START Primary Collection 0 0 0 0 NaN
181687 201712310029 2017 12 31 NaN 0 NaN 200 Syria 10 ... NaN "Putin's 'victory' in Syria has turned into a ... "Two Russian soldiers killed at Hmeymim base i... "Two Russian servicemen killed in Syria mortar... START Primary Collection -9 -9 1 1 NaN
181688 201712310030 2017 12 31 NaN 0 NaN 160 Philippines 5 ... NaN "Maguindanao clashes trap tribe members," Phil... NaN NaN START Primary Collection 0 0 0 0 NaN
181689 201712310031 2017 12 31 NaN 0 NaN 92 India 6 ... NaN "Trader escapes grenade attack in Imphal," Bus... NaN NaN START Primary Collection -9 -9 0 -9 NaN
181690 201712310032 2017 12 31 NaN 0 NaN 160 Philippines 5 ... NaN "Security tightened in Cotabato following IED ... "Security tightened in Cotabato City," Manila ... NaN START Primary Collection -9 -9 0 -9 NaN

181691 rows × 135 columns

DATA INFO¶

In [27]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 181691 entries, 0 to 181690
Columns: 135 entries, eventid to related
dtypes: float64(55), int64(22), object(58)
memory usage: 187.1+ MB

DATA DESCRIBE¶

df.describe() generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.

In [28]:
df.describe()
Out[28]:
eventid iyear imonth iday extended country region latitude longitude specificity ... ransomamt ransomamtus ransompaid ransompaidus hostkidoutcome nreleased INT_LOG INT_IDEO INT_MISC INT_ANY
count 1.816910e+05 181691.000000 181691.000000 181691.000000 181691.000000 181691.000000 181691.000000 177135.000000 1.771340e+05 181685.000000 ... 1.350000e+03 5.630000e+02 7.740000e+02 552.000000 10991.000000 10400.000000 181691.000000 181691.000000 181691.000000 181691.000000
mean 2.002705e+11 2002.638997 6.467277 15.505644 0.045346 131.968501 7.160938 23.498343 -4.586957e+02 1.451452 ... 3.172530e+06 5.784865e+05 7.179437e+05 240.378623 4.629242 -29.018269 -4.543731 -4.464398 0.090010 -3.945952
std 1.325957e+09 13.259430 3.388303 8.814045 0.208063 112.414535 2.933408 18.569242 2.047790e+05 0.995430 ... 3.021157e+07 7.077924e+06 1.014392e+07 2940.967293 2.035360 65.720119 4.543547 4.637152 0.568457 4.691325
min 1.970000e+11 1970.000000 0.000000 0.000000 0.000000 4.000000 1.000000 -53.154613 -8.618590e+07 1.000000 ... -9.900000e+01 -9.900000e+01 -9.900000e+01 -99.000000 1.000000 -99.000000 -9.000000 -9.000000 -9.000000 -9.000000
25% 1.991021e+11 1991.000000 4.000000 8.000000 0.000000 78.000000 5.000000 11.510046 4.545640e+00 1.000000 ... 0.000000e+00 0.000000e+00 -9.900000e+01 0.000000 2.000000 -99.000000 -9.000000 -9.000000 0.000000 -9.000000
50% 2.009022e+11 2009.000000 6.000000 15.000000 0.000000 98.000000 6.000000 31.467463 4.324651e+01 1.000000 ... 1.500000e+04 0.000000e+00 0.000000e+00 0.000000 4.000000 0.000000 -9.000000 -9.000000 0.000000 0.000000
75% 2.014081e+11 2014.000000 9.000000 23.000000 0.000000 160.000000 10.000000 34.685087 6.871033e+01 1.000000 ... 4.000000e+05 0.000000e+00 1.273412e+03 0.000000 7.000000 1.000000 0.000000 0.000000 0.000000 0.000000
max 2.017123e+11 2017.000000 12.000000 31.000000 1.000000 1004.000000 12.000000 74.633553 1.793667e+02 5.000000 ... 1.000000e+09 1.320000e+08 2.750000e+08 48000.000000 7.000000 2769.000000 1.000000 1.000000 1.000000 1.000000

8 rows × 77 columns
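
The minimum values of -9 and -99 in the table above look like placeholder codes for unknown values rather than real measurements, and they pull the means of columns such as nreleased and INT_LOG downward. The sketch below uses a small made-up frame to show one way to mask such codes before summarizing. Treating -9 and -99 as "unknown" is an assumption here and should be checked against the GTD codebook; the toy values are invented.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values; -9 and -99 stand in for "unknown".
toy = pd.DataFrame({
    "nreleased": [-99, 0, 3, -99, 5],
    "INT_LOG": [-9, 0, 1, -9, 0],
})

# Assumption: -9 and -99 are placeholder codes, not real measurements.
cleaned = toy.replace([-9, -99], np.nan)

print(toy["nreleased"].mean())      # distorted by the -99 entries
print(cleaned["nreleased"].mean())  # mean of the known values only
```

A blanket replace like this is only safe for columns where -9 and -99 cannot be genuine values, so on the real data it would be applied column by column.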

DATA CORRELATION¶

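Correlation measures how strongly two variables move together: a positive value means they tend to increase together (directly proportional), and a negative value means one tends to decrease as the other increases (inversely proportional). A correlation of 0 means there is no linear relationship between the two variables.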

In [29]:
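# restrict to numeric columns; newer pandas versions raise an error on text columns
df.select_dtypes(include='number').corr()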
Out[29]:
eventid iyear imonth iday extended country region latitude longitude specificity ... ransomamt ransomamtus ransompaid ransompaidus hostkidoutcome nreleased INT_LOG INT_IDEO INT_MISC INT_ANY
eventid 1.000000 0.999996 0.002706 0.018336 0.091761 -0.135039 0.401371 0.166886 0.003907 0.030641 ... -0.009990 -0.018001 -0.014094 -0.165422 0.256113 -0.181612 -0.143600 -0.133252 -0.077852 -0.175605
iyear 0.999996 1.000000 0.000139 0.018254 0.091754 -0.135023 0.401384 0.166933 0.003917 0.030626 ... -0.009984 -0.018216 -0.014238 -0.165375 0.256092 -0.181556 -0.143601 -0.133253 -0.077847 -0.175596
imonth 0.002706 0.000139 1.000000 0.005497 -0.000468 -0.006305 -0.002999 -0.015978 -0.003880 0.003621 ... -0.000710 0.046989 0.058878 -0.016597 0.011295 -0.011535 -0.002302 -0.002034 -0.002554 -0.006336
iday 0.018336 0.018254 0.005497 1.000000 -0.004700 0.003468 0.009710 0.003423 -0.002285 -0.006991 ... 0.012755 -0.010502 0.003148 -0.006581 -0.006706 0.001765 -0.001540 -0.001621 -0.002027 -0.001199
extended 0.091761 0.091754 -0.000468 -0.004700 1.000000 -0.020466 0.038389 -0.024749 0.000523 0.057897 ... -0.008114 0.028177 0.001966 0.009367 0.233293 -0.192155 0.071768 0.075147 0.027335 0.080767
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
nreleased -0.181612 -0.181556 -0.011535 0.001765 -0.192155 -0.044331 -0.149511 0.002790 -0.017745 -0.030631 ... 0.054571 0.034843 0.049322 0.016832 -0.555478 1.000000 0.039388 0.040947 0.085055 0.064759
INT_LOG -0.143600 -0.143601 -0.002302 -0.001540 0.071768 0.069904 -0.082584 -0.099827 0.002272 0.073022 ... 0.035821 0.031079 0.007029 -0.045504 -0.015442 0.039388 1.000000 0.996211 0.052537 0.891051
INT_IDEO -0.133252 -0.133253 -0.002034 -0.001621 0.075147 0.067564 -0.071917 -0.094470 0.002268 0.071333 ... 0.039053 0.041983 0.013162 -0.039844 -0.016234 0.040947 0.996211 1.000000 0.082014 0.893811
INT_MISC -0.077852 -0.077847 -0.002554 -0.002027 0.027335 0.207281 0.043139 0.097652 0.000371 -0.019197 ... 0.023815 0.125162 0.037227 0.129274 -0.119776 0.085055 0.052537 0.082014 1.000000 0.252193
INT_ANY -0.175605 -0.175596 -0.006336 -0.001199 0.080767 0.153118 -0.047900 -0.041530 0.002497 0.061389 ... 0.028054 0.053484 0.007275 0.056438 -0.061946 0.064759 0.891051 0.893811 0.252193 1.000000

77 rows × 77 columns
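
To make the coefficients easier to read, here is a small sketch on made-up columns (x, doubled, negated and squared are illustrative names, not GTD fields). It also shows why a coefficient of 0 only rules out a linear relationship: the squared column is fully determined by x, yet its correlation with x is zero.

```python
import pandas as pd

# Made-up values, symmetric around zero.
toy = pd.DataFrame({"x": [-2, -1, 0, 1, 2]})
toy["doubled"] = 2 * toy["x"]    # directly proportional to x
toy["negated"] = -toy["x"]       # inversely related to x
toy["squared"] = toy["x"] ** 2   # depends on x, but not linearly

corr = toy.corr()  # Pearson correlation by default
print(corr.loc["x", "doubled"])  # approximately 1
print(corr.loc["x", "negated"])  # approximately -1
print(corr.loc["x", "squared"])  # approximately 0
```

The same reading explains the 0.999996 between eventid and iyear in the table above: the event id begins with the year, so the two columns rise together almost perfectly.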

In [30]:
df.head()
Out[30]:
eventid iyear imonth iday approxdate extended resolution country country_txt region ... addnotes scite1 scite2 scite3 dbsource INT_LOG INT_IDEO INT_MISC INT_ANY related
0 197000000001 1970 7 2 NaN 0 NaN 58 Dominican Republic 2 ... NaN NaN NaN NaN PGIS 0 0 0 0 NaN
1 197000000002 1970 0 0 NaN 0 NaN 130 Mexico 1 ... NaN NaN NaN NaN PGIS 0 1 1 1 NaN
2 197001000001 1970 1 0 NaN 0 NaN 160 Philippines 5 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
3 197001000002 1970 1 0 NaN 0 NaN 78 Greece 8 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
4 197001000003 1970 1 0 NaN 0 NaN 101 Japan 4 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN

5 rows × 135 columns

In [31]:
df.tail()
Out[31]:
eventid iyear imonth iday approxdate extended resolution country country_txt region ... addnotes scite1 scite2 scite3 dbsource INT_LOG INT_IDEO INT_MISC INT_ANY related
181686 201712310022 2017 12 31 NaN 0 NaN 182 Somalia 11 ... NaN "Somalia: Al-Shabaab Militants Attack Army Che... "Highlights: Somalia Daily Media Highlights 2 ... "Highlights: Somalia Daily Media Highlights 1 ... START Primary Collection 0 0 0 0 NaN
181687 201712310029 2017 12 31 NaN 0 NaN 200 Syria 10 ... NaN "Putin's 'victory' in Syria has turned into a ... "Two Russian soldiers killed at Hmeymim base i... "Two Russian servicemen killed in Syria mortar... START Primary Collection -9 -9 1 1 NaN
181688 201712310030 2017 12 31 NaN 0 NaN 160 Philippines 5 ... NaN "Maguindanao clashes trap tribe members," Phil... NaN NaN START Primary Collection 0 0 0 0 NaN
181689 201712310031 2017 12 31 NaN 0 NaN 92 India 6 ... NaN "Trader escapes grenade attack in Imphal," Bus... NaN NaN START Primary Collection -9 -9 0 -9 NaN
181690 201712310032 2017 12 31 NaN 0 NaN 160 Philippines 5 ... NaN "Security tightened in Cotabato following IED ... "Security tightened in Cotabato City," Manila ... NaN START Primary Collection -9 -9 0 -9 NaN

5 rows × 135 columns

FEATURES OF DATA¶

In [34]:
df.columns
Out[34]:
Index(['eventid', 'iyear', 'imonth', 'iday', 'approxdate', 'extended',
       'resolution', 'country', 'country_txt', 'region',
       ...
       'addnotes', 'scite1', 'scite2', 'scite3', 'dbsource', 'INT_LOG',
       'INT_IDEO', 'INT_MISC', 'INT_ANY', 'related'],
      dtype='object', length=135)

DATA VISUALIZATION

US TERROR ATTACKS DEATH AND INJURIES

In [95]:
#Line Plot
#kind = type of plot, color = color, label = label, linewidth = width of line, alpha = opacity, grid = grid, linestyle = style of line
df.nkillus.plot(kind = 'line', color = 'red', label = 'The Number of Total Confirmed Fatalities for US', linewidth = 2, alpha = 0.8, grid = True, 
                 linestyle = ':', figsize = (40,40), fontsize=15)
df.nwoundus.plot(color = "green", label = 'The Number of Confirmed Non-Fatal Injuries for US', linewidth = 2, alpha = 1, grid = True, 
                 linestyle = '-.', figsize = (20,20), fontsize=15)

plt.legend(loc='upper right',fontsize=15)     # legend = puts label into plot
plt.xlabel('Database Index', fontsize=30)              # label = name of label
plt.ylabel('Number of Deaths or Injuries', fontsize=30)

plt.title('Confirmed Fatalities & Non-Fatal Injuries for US',fontsize=40)            #plot title
plt.show()

Because the rows are sorted by date, the index on the x-axis acts as a rough timeline. Attacks that killed or injured US citizens are rare over a long stretch of that timeline and become more frequent after it. Finding the date at which the increase begins would make it possible to look for the factors behind it in the changes and developments that followed.
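
One way to locate the start of the increase is to total the US casualties per year instead of reading them off the row index. The following is a minimal sketch on invented rows; only the column names iyear, nkillus and nwoundus come from the dataset.

```python
import pandas as pd

# Invented rows; None marks a missing casualty count.
toy = pd.DataFrame({
    "iyear":    [1970, 1970, 1971, 1972, 1972, 1972],
    "nkillus":  [0, None, 1, 0, 2, 3],
    "nwoundus": [0, 1, None, 0, 4, 1],
})

# Sum fatalities and injuries per year; sum() skips missing values.
yearly = toy.groupby("iyear")[["nkillus", "nwoundus"]].sum()
print(yearly)
```

On the full dataset, the resulting table can be plotted against the year, which gives the x-axis a direct meaning.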

DEATH AND INJURIES AT ALL TIME¶

In [99]:
# Scatter Plot 
# Generally used to compare two different features.
# Here, x = number killed (nkill), y = number wounded (nwound)
df.plot(kind = 'scatter', x = 'nkill', y = 'nwound', alpha = 0.5, color = 'red', figsize = (40,40), fontsize=40)
plt.xlabel('Kill', fontsize=30)
plt.ylabel('Wound', fontsize=30)
plt.title('Kill - Wound Scatter Plot',fontsize=40)
plt.show()

In the majority of terrorist attacks, the numbers of deaths and injuries were low, but a small number of attacks caused very large numbers of deaths and injuries.
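
The observation can be put into numbers with two simple ratios. This is a sketch on invented counts, not figures from the dataset: the share of attacks with no deaths, and the share of all deaths caused by the single deadliest attack.

```python
import pandas as pd

# Invented death counts for ten attacks.
toy = pd.DataFrame({"nkill": [0, 0, 0, 1, 0, 2, 0, 0, 150, 1]})

# Fraction of attacks in which nobody was killed.
no_deaths_share = (toy["nkill"] == 0).mean()

# Fraction of all deaths caused by the deadliest attack.
deadliest_share = toy["nkill"].max() / toy["nkill"].sum()

print(no_deaths_share, deadliest_share)
```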

GRAPHICS AND ANALYSIS ON CUSTOMIZED DATA¶

TERRORIST ATTACKS OF A PARTICULAR YEAR AND THEIR LOCATIONS

Let's look at the terrorist acts in the world over a certain year.

In [100]:
filterYear = df['iyear'] == 1970 # filter the terrorist acts 
In [41]:
filterData = df[filterYear] # filter data
# filterData.info()
reqFilterData = filterData.loc[:,'city':'longitude'] #We are getting the required fields
reqFilterData = reqFilterData.dropna() # drop NaN values in latitude and longitude
reqFilterDataList = reqFilterData.values.tolist()
# reqFilterDataList
In [101]:
# map: location = camera location, zoom_start = initial zoom size, tiles = map background
# marker: location = marker location, popup = popup message(str)
attack_map = folium.Map(location = [0, 30], tiles='CartoDB positron', zoom_start=4)  # avoid shadowing the built-in map()
# clustered marker
markerCluster = MarkerCluster().add_to(attack_map)
for city, latitude, longitude in reqFilterDataList:
    folium.Marker(location=[latitude, longitude], popup = city).add_to(markerCluster)
attack_map

84% of the terrorist attacks in 1970 were carried out in the Americas. In 1970, the Middle East and North Africa, currently a center of wars and terrorist attacks, saw only one terrorist attack.
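
A regional share like this can be computed directly rather than estimated from the map. The sketch below uses invented rows; the region labels follow the region_txt values seen in the dataset, and grouping three of them as "the Americas" is an assumption made for illustration.

```python
import pandas as pd

# Invented rows: five attacks in 1970 and one in 1971.
toy = pd.DataFrame({
    "iyear": [1970, 1970, 1970, 1970, 1970, 1971],
    "region_txt": ["North America", "North America", "North America",
                   "South America", "Western Europe",
                   "Middle East & North Africa"],
})

regions_1970 = toy.loc[toy["iyear"] == 1970, "region_txt"]

# Share of the year's attacks in each region.
shares = regions_1970.value_counts(normalize=True)

# Assumed grouping of regions that make up the Americas.
americas = ["North America", "Central America & Caribbean", "South America"]
americas_share = regions_1970.isin(americas).mean()

print(shares)
print(americas_share)
```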

TOTAL NUMBER OF PEOPLE KILLED IN TERROR ATTACK

According to records, the total number of people killed in terrorist attacks:

In [46]:
killData = df.loc[:,'nkill']
print('Number of people killed by terror attack:', int(killData.sum())) # sum() skips the NaN values
Number of people killed by terror attack: 411868


In [52]:
countryData = df.loc[:,'country':'country_txt']
# countryData
countryKillData = pd.concat([countryData, killData], axis=1)
# countryKillData
In [53]:
# pivot table sum kill values for the same country_txt
countryKillFormatData = countryKillData.pivot_table(columns='country_txt', values='nkill', aggfunc='sum')
countryKillFormatData
Out[53]:
country_txt Afghanistan Albania Algeria Andorra Angola Antigua and Barbuda Argentina Armenia Australia Austria ... Vietnam Wallis and Futuna West Bank and Gaza Strip West Germany (FRG) Western Sahara Yemen Yugoslavia Zaire Zambia Zimbabwe
nkill 39384.0 42.0 11066.0 0.0 3043.0 0.0 490.0 37.0 23.0 30.0 ... 1.0 0.0 1500.0 97.0 1.0 8776.0 119.0 324.0 70.0 154.0

1 rows × 205 columns

In [54]:
countryKillFormatData.info()
<class 'pandas.core.frame.DataFrame'>
Index: 1 entries, nkill to nkill
Columns: 205 entries, Afghanistan to Zimbabwe
dtypes: float64(205)
memory usage: 1.6+ KB

A single bar chart becomes unreadable when too many countries are put into it, so the 205 countries are split across four charts of roughly 50 bars each, which makes everything much clearer.
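
The splitting itself can be written once and reused instead of repeating slice bounds by hand. This is a minimal sketch with a hypothetical helper named chunked and a made-up series standing in for deaths per country; it is not the approach the cells in this notebook take.

```python
import pandas as pd

def chunked(series, size):
    """Yield consecutive slices of `series`, each with at most `size` items."""
    for start in range(0, len(series), size):
        yield series.iloc[start:start + size]

# Made-up stand-in for deaths per country (205 entries).
deaths = pd.Series(range(205), index=[f"country_{i}" for i in range(205)])

chunks = list(chunked(deaths, 50))
print([len(chunk) for chunk in chunks])  # [50, 50, 50, 50, 5]

# Each chunk could then be drawn on its own figure, for example:
# for chunk in chunks:
#     chunk.plot(kind="bar", figsize=(25, 25))
```

A fixed chunk size leaves a short final chart, which is the trade-off against choosing uneven slice bounds manually.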

In [62]:
# fig_size used to resize the graphic
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=50
fig_size[1]=50
plt.rcParams["figure.figsize"] = fig_size
In [107]:
labels = countryKillFormatData.columns.tolist()
labels = labels[:50] # 50 bars per chart keeps the labels readable
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[:50]
values = [int(i[0]) for i in values] # convert float to int
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange'] # color list for bar chart bar color 
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=40)
# print(fig_size)
plt.show()
In [108]:
labels = countryKillFormatData.columns.tolist()
labels = labels[50:101]
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[50:101]
values = [int(i[0]) for i in values]
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange']
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=30)
plt.show()
In [109]:
labels = countryKillFormatData.columns.tolist()
labels = labels[101:152]
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[101:152]
values = [int(i[0]) for i in values]
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange']
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=30)
plt.show()
In [110]:
labels = countryKillFormatData.columns.tolist()
labels = labels[152:206]
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[152:206]
values = [int(i[0]) for i in values]
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange']
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=30)
plt.show()

The charts show that terrorist attacks in the Middle East and North Africa have had especially deadly consequences. In addition, even though there is a perception that Muslims are supporters of terrorism, Muslims are the people who have suffered most from terrorist attacks: Iraq, Afghanistan and Pakistan have the highest death tolls in the charts, and all three are Muslim-majority countries.
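
Reading the highest bars off four separate charts is error-prone, so a ranked table can serve as a cross-check. The sketch below uses invented numbers; only the column names country_txt and nkill come from the dataset.

```python
import pandas as pd

# Invented rows; None marks an attack with an unknown death count.
toy = pd.DataFrame({
    "country_txt": ["Iraq", "Iraq", "Peru", "Iraq", "Peru", "Japan"],
    "nkill": [10, 5, 2, None, 1, 0],
})

# Total deaths per country (missing values are skipped), largest first.
top_countries = toy.groupby("country_txt")["nkill"].sum().nlargest(2)
print(top_countries)
```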

FREQUENCY OF TERRORISM ACTION¶

In [112]:
plt.subplots(figsize=(15,6))
year_counts = df['iyear'].value_counts()
sns.barplot(x=year_counts.index, y=year_counts.values) # keyword arguments avoid seaborn's FutureWarning
plt.xticks(rotation=90)
plt.xlabel('year', fontsize=20)

plt.title('Number Of Terrorist Activities Each Year',fontsize=30)
plt.show()

This bar plot shows a huge increase in terrorist attacks from the 2000s onward compared with 1970-2000. 2014 had the most terrorist attacks. The biggest increase came between 2011 and 2014, and after 2014 the number of terrorist attacks decreases.
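
The peak year and the steepest rise can also be computed instead of being read from the bars. Here is a minimal sketch on invented rows, in which each row stands for one attack.

```python
import pandas as pd

# Invented rows: one entry per attack.
toy = pd.DataFrame({
    "iyear": [2011] * 2 + [2012] * 3 + [2013] * 5 + [2014] * 9 + [2015] * 7
})

# Attacks per year, in chronological order.
attacks = toy["iyear"].value_counts().sort_index()

# Change compared with the previous year (the first year has no predecessor).
change = attacks.diff()

print(attacks.idxmax())  # year with the most attacks
print(change.idxmax())   # year with the largest year-over-year increase
```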

Terrorist Activities by Region in each Year through Area Plot

In [114]:
pd.crosstab(df.iyear, df.region_txt).plot(kind='area',figsize=(15,6))
plt.title('Terrorist Activities by Region in each Year',fontsize=30)
plt.ylabel('Number of Attacks',fontsize=20)
plt.show()

Here we see that South Asia, the Middle East & North Africa, Sub-Saharan Africa, and South America have the most terrorist attacks. Western countries also tend to have fewer terrorist attacks than developing countries. It is not surprising that South Asia, the Middle East, Africa, and South America rank highest, given the large disparities in wealth, the religious differences, and the territorial disputes over oil in these regions.

TREND OF TERRORIST ATTACK¶

In [113]:
terror_region=pd.crosstab(df.iyear,df.region_txt)
terror_region.plot(color=sns.color_palette('Set2',12))
fig=plt.gcf()
fig.set_size_inches(18,6)
plt.show()

Activity of Top Terrorist Groups¶

In [74]:
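# position 0 of the counts is 'Unknown', so take positions 1-10 for the ten most active named groups
top_groups10=df[df['gname'].isin(df['gname'].value_counts().iloc[1:11].index)]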
pd.crosstab(top_groups10.iyear,top_groups10.gname).plot(color=sns.color_palette('Paired',10))
fig=plt.gcf()
fig.set_size_inches(18,6)
plt.show()

We can see that the top terrorist groups include ISIL, the Taliban, Shining Path and Al-Shabaab. These groups are active in different regions: ISIL mainly in the Middle East, the Taliban in Afghanistan, Shining Path in Peru and Al-Shabaab in Somalia. The New People's Army, which also appears in the chart, is the armed wing of the Communist Party of the Philippines.
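
Selecting the top groups by position relies on 'Unknown' being the most frequent label. A sketch of a more explicit alternative, on invented rows, drops that label by name before taking the top entries.

```python
import pandas as pd

# Invented rows; 'Unknown' marks attacks with no attributed group.
toy = pd.DataFrame({
    "gname": ["Unknown", "Unknown", "Unknown", "Taliban", "Taliban",
              "ISIL", "Shining Path", "Taliban", "ISIL"]
})

counts = toy["gname"].value_counts()

# Remove 'Unknown' by label; errors="ignore" keeps this safe if it is absent.
top_groups = counts.drop("Unknown", errors="ignore").head(2)
print(top_groups)
```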

In [75]:
df.city.value_counts().head(15)
Out[75]:
Unknown         9775
Baghdad         7589
Karachi         2652
Lima            2359
Mosul           2265
Belfast         2171
Santiago        1621
Mogadishu       1581
San Salvador    1558
Istanbul        1048
Athens          1019
Bogota           984
Kirkuk           925
Beirut           918
Medellin         848
Name: city, dtype: int64
In [117]:
#Creating new dataframe without Unknown Category
filtered = df[df['city'] != 'Unknown']

#Barplot
plt.subplots(figsize=(15,6))
top_cities = filtered['city'].value_counts().head(15)
sns.barplot(x=top_cities.index, y=top_cities.values, palette = "viridis") # keyword arguments avoid seaborn's FutureWarning
plt.xticks(rotation=90)
plt.title('Cities With The Most Terrorist Attacks',fontsize=30)
plt.show()

From the bar plot, we see that the five cities with the most terrorist attacks are Baghdad, Karachi, Lima, Mosul, and Belfast. It is surprising to see Belfast, Northern Ireland, so high on the list, since on the interactive map created earlier that area does not stand out as one of the places with the most terrorist attacks.

Terrorist Attack Types¶

In [124]:
#Selecting Columns
data = df[['iyear','country_txt','city','region_txt','attacktype1_txt','weaptype1_txt','targtype1_txt', 'gname']]
data.head()
Out[124]:
iyear country_txt city region_txt attacktype1_txt weaptype1_txt targtype1_txt gname
0 1970 Dominican Republic Santo Domingo Central America & Caribbean Assassination Unknown Private Citizens & Property MANO-D
1 1970 Mexico Mexico city North America Hostage Taking (Kidnapping) Unknown Government (Diplomatic) 23rd of September Communist League
2 1970 Philippines Unknown Southeast Asia Assassination Unknown Journalists & Media Unknown
3 1970 Greece Athens Western Europe Bombing/Explosion Explosives Government (Diplomatic) Unknown
4 1970 Japan Fukouka East Asia Facility/Infrastructure Attack Incendiary Government (Diplomatic) Unknown
In [127]:
import plotly.express as px
#Making a new column called AttackCount, the number of attacks of each type, and adding it to the dataframe
#assign returns a new dataframe, which avoids pandas' SettingWithCopyWarning
data = data.assign(AttackCount=df.attacktype1_txt.groupby(df.attacktype1_txt).transform('count'))

#Creating new Dataframe to get only Attack Type and AttackCount and dropping duplicates from Attack Type
data1 = data.copy()
data2 = data1[['attacktype1_txt','AttackCount']]
data3 = data2.drop_duplicates(keep='first')

#Pie Chart
fig = px.pie(data3, values="AttackCount",
             names="attacktype1_txt",title='Terrorist Attack Types',
             color_discrete_sequence=px.colors.sequential.RdBu)
fig.show()

From the pie chart, the most common attack type is Bombing/Explosion, followed by Armed Assault and Assassination. Bombings and explosions are likely the most common for several reasons: information on how to make bombs is easily available, using a bomb or explosive is easier than attempting an armed assault or an assassination, and obtaining a bomb involves nothing like the background checks needed to buy a firearm or the difficulty of getting close enough to assassinate someone important.
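
The counts behind the pie chart, together with the deaths attributed to each attack type, can be produced in a single grouped table. The rows below are invented; only the column names attacktype1_txt and nkill come from the dataset, and the output names attacks and deaths are chosen for illustration.

```python
import pandas as pd

# Invented rows; None marks an attack with an unknown death count.
toy = pd.DataFrame({
    "attacktype1_txt": ["Bombing/Explosion", "Bombing/Explosion",
                        "Armed Assault", "Assassination",
                        "Bombing/Explosion", "Armed Assault"],
    "nkill": [1, 0, 4, 1, None, 2],
})

# Per attack type: number of attacks (size) and total deaths (sum skips NaN).
summary = (
    toy.groupby("attacktype1_txt")["nkill"]
       .agg(attacks="size", deaths="sum")
       .sort_values("attacks", ascending=False)
)
print(summary)
```

In these toy numbers the most frequent attack type is not the deadliest one, which is the kind of difference a table like this can reveal on the real data.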
